VSR: A Unified Framework for Document Layout Analysis Combining Vision, Semantics and Relations

Abstract

Document layout analysis is crucial for understanding document structures. On this task, vision and semantics of documents, and relations between layout components, contribute to the understanding process. Though many works have been proposed to exploit the above information, they show unsatisfactory results. NLP-based methods model layout analysis as a sequence labeling task and show insufficient capabilities in layout modeling. CV-based methods model layout analysis as a detection or segmentation task, but bear limitations of inefficient modality fusion and lack of relation modeling between layout components. To address the above limitations, we propose a unified framework VSR for document layout analysis, combining vision, semantics and relations. VSR supports both NLP-based and CV-based methods. Specifically, we first introduce vision through the document image and semantics through text embedding maps. Then, modality-specific visual and semantic features are extracted using a two-stream network, which are adaptively fused to make full use of complementary information. Finally, given component candidates, a relation module based on a graph neural network is incorporated to model relations between components and output final results. On three popular benchmarks, VSR outperforms previous models by large margins. Code will be released soon.

Similar resources

Document Structure and Layout Analysis

A document image is composed of a variety of physical entities or regions such as text blocks, lines, words, figures, tables, and background. We could also assign functional or logical labels such as sentences, titles, captions, author names, and addresses to some of these regions. The process of document structure and layout analysis tries to decompose a given document image into its component...

Combining Statistics and Semantics for Word and Document Clustering

A new approach for constructing pseudo-keywords, referred to as Sense Units, is proposed. Sense Units are obtained by a word clustering process, where the underlying similarity reflects both statistical and semantic properties, respectively detected through Latent Semantic Analysis and WordNet. Sense Units are used to recode documents and are evaluated from the performance increase they permit ...

A Framework for Identifying and Prioritizing Factors Affecting Customers’ Online Shopping Behavior in Iran

The purpose of this study is to identify the factors that lead customers to shop online in Iran and to investigate the importance of those factors in online customers’ decisions. In the identification phase, to discover the factors affecting the online shopping behavior of customers in Iran, the derived reference model summarizing antecedents of online shopping proposed by Chang et al. was us...

Hierarchical Fuzzy Clustering Semantics (HFCS) in Web Document for Discovering Latent Semantics

This paper discusses the future of World Wide Web development, called the Semantic Web. Undoubtedly, the Web is one of the most important services on the Internet and has had the greatest impact on the spread of the Internet in human societies. Internet penetration has been an effective factor in the growth of the volume of information on the Web. The massive growth of informat...

Combining Syntax and Semantics for Automatic Extractive Single-Document Summarization

The goal of automated summarization is to tackle the “information overload” problem by extracting and perhaps compressing the most important content of a document. Due to the difficulty that single-document summarization has in beating a standard baseline, especially for news articles, most efforts are currently focused on multi-document summarization. The goal of this study is to reconsider the...

Journal

Journal title: Lecture Notes in Computer Science

Year: 2021

ISSN: 1611-3349, 0302-9743

DOI: https://doi.org/10.1007/978-3-030-86549-8_8